Abstract

Background: Hafnia alvei is an opportunistic pathogen involved in various types of nosocomial infections.
The species has been found to inhabit food and mammalian guts. However, its status as an enteropathogen, and
whether food-inhabiting strains could be a source of gastrointestinal infection, remain obscure. In this report
we present a draft genome of H. alvei strain FB1 isolated from fish paste meatball, a food popular among Malaysian
and Chinese populations. The data were generated on the Illumina MiSeq platform.

Results: A comparative study was carried out on FB1 against two other previously sequenced H. alvei genomes.
Several gene clusters putatively involved in survival and pathogenesis of H. alvei FB1 in food and gut environments
were characterised in this study. These include the widespread colonisation island (WCI) carrying the tad locus, which is
known to play an essential role in biofilm formation, a eut operon that might confer an advantage in nutrient
acquisition in the gut environment, and genes responsible for siderophore production. These features enable the bacteria
to successfully colonise the host gut environment.

Conclusion: With the whole genome data of H. alvei FB1 presented in this study, we hope to provide an insight
for future studies on this candidate enteropathogen by looking into the possible mechanisms employed to
survive stresses and gain advantage in competition, which eventually lead to successful colonisation and
pathogenesis. This is to serve as the basis for more effective clinical diagnosis and treatment.

Keywords: Hafnia alvei, Gut pathogen, Widespread Colonisation Island, tad, Ethanolamine utilisation, eut, Siderophore

Background

Hafnia alvei is a flagellated, motile, facultatively anaerobic
opportunistic pathogen of the Enterobacteriaceae family,
which is also known to play a role in microbial food spoilage [1].
This species has been isolated from a wide range
of nosocomial infections, including septicaemia, as well
as respiratory, enteric, and urinary tract infections [2,3].
In addition, H. alvei has commonly been found
in abundance within communities of N-acyl
homoserine lactone (AHL)-producing food spoilers [1,4].

Although H. alvei is known to inhabit the gastrointestinal
tracts of various animal species, its status as
an enteropathogen remains disputed. Clinical cases
associated with H. alvei were most intensively reported
in the 1990s. However, solid evidence that the species
can be the sole cause of gastrointestinal infection, and
that it can be acquired via food, is yet to be found.
Several groups have attempted to investigate the possible
pathogenesis pathways of H. alvei via biochemical and
in vitro approaches [5-7]. However, the molecular basis
of the mechanisms has not yet been demonstrated.

In this study, we sequenced the genome of H. alvei
FB1 isolated from fish paste meatballs, a food made of
fish paste popular among Southern and overseas Chinese
communities. The processes of mashing and mixing
in the making of fish paste meatballs bring the ingredients
into frequent contact with food processing surfaces.
Along the way, bacterial cells detached from
biofilm-contaminated surfaces could become entrapped
and immobilised within the food matrices. H. alvei has
been reported to survive temperatures as low
as 0.2°C [8]. The ability of H. alvei to form biofilm [9]
and the connection of this trait to the concentration of
AHLs have made it an interesting subject of study in
controlling chronic contamination in the food industry [10].
It is therefore important to find out whether this common
microbial contaminant of food could also be a source of
gut infection.

Advances in next generation sequencing technology and
the availability of powerful bioinformatics
pipelines enable bacterial genomes to be explored with
much ease. To date, only two other H. alvei genomes
have been sequenced: strain ATCC 51873, isolated from
gut, and strain BIDMC 31, isolated from an unspecified
clinical source as part of a study on carbapenem resistance
(http://www.ncbi.nlm.nih.gov/genome). Here
we looked into the putative means of survival and pathogenesis
of H. alvei as a candidate food and gut pathogen from a
comparative genomics perspective. We hope this will provide
an insight into the molecular diversity of the species through
comparison between the strain originating from food and
those from guts, and a basis for more in-depth
investigation in the future.

Methods 

Bacteria culture 

H. alvei FB1 was among the four bacterial species isolated
in April 2013 from a packet of vacuum-packed fish paste
meatballs sold in a local supermarket. The sample was
spread on MacConkey agar (MAC) plates for selective
and differential purposes. Single colonies were picked and
sub-cultured at least twice to ensure the purity of
each isolate. The isolates were identified via the Microflex
MALDI Biotyper system (Bruker, Germany) and 16S
rDNA PCR prior to sequencing. The identified strains
were maintained routinely on Luria-Bertani (LB) agar
plates (Scharlau, Germany) at 37°C.

Genomic DNA extraction 

Genomic DNA was extracted from overnight liquid culture
with the MasterPure™ DNA Purification Kit (Epicentre,
USA) according to the protocol provided by the manufacturer.
Routine quantification was performed on a Qubit 2.0
Fluorometer with the dsDNA High Sensitivity Assay Kit
(Invitrogen, USA), whereas quality was assessed with a NanoDrop
2000 Spectrophotometer (Thermo Scientific, USA) and
gel electrophoresis. DNA samples were normalised to a
concentration of 1.8 ng/µl prior to library preparation.
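
The normalisation step can be illustrated with the standard dilution relation C1 x V1 = C2 x V2. The following is a minimal sketch only: the function name, the stock concentration of 45 ng/µl and the final volume of 50 µl are hypothetical values chosen for illustration and are not taken from the study; only the target of 1.8 ng/µl comes from the text.

```python
from typing import Tuple


def dilution_volumes(stock_ng_per_ul: float,
                     target_ng_per_ul: float,
                     final_volume_ul: float) -> Tuple[float, float]:
    """Return (stock volume, diluent volume) in µl from C1 * V1 = C2 * V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_volume, final_volume_ul - stock_volume


# Hypothetical fluorometer reading of 45 ng/µl, diluted to the 1.8 ng/µl
# target in a hypothetical final volume of 50 µl.
stock_ul, diluent_ul = dilution_volumes(45.0, 1.8, 50.0)
print(f"{stock_ul:.1f} µl stock + {diluent_ul:.1f} µl diluent")
```

With these illustrative inputs, 2.0 µl of stock is made up to 50 µl with 48.0 µl of diluent.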

Library preparation for genome sequencing 

Sequencing template was prepared with the Nextera DNA
Sample Preparation Kit (Illumina, USA). Quality checking
of the prepared library was performed using the Agilent
2100 Bioanalyzer High Sensitivity DNA Kit (Agilent
Technologies, Canada). Denatured DNA library at ten
picomolar (10 pM) was loaded into the sequencing
cartridge and sequenced on the Illumina MiSeq platform.

Assembly and annotation 

Quality assessment, trimming and assembly of the sequencing
reads were performed using CLC Genomics
Workbench 6 (http://www.clcbio.com). Raw reads were
trimmed at a Phred quality threshold of 30 and de novo
assembled into 39 contigs. Assembled sequences were then
annotated using the RAST (Rapid Annotation using Subsystem
Technology) pipeline [11].
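
The assembly metrics used in this report (Phred threshold, N50 and G + C content) follow standard definitions, sketched below. This is an illustrative sketch and not the CLC Genomics Workbench implementation; the function names and the toy contig lengths are hypothetical.

```python
def phred_error_probability(q: float) -> float:
    """Base-call error probability for a Phred score Q, where Q = -10 * log10(P)."""
    return 10 ** (-q / 10)


def n50(contig_lengths):
    """Length of the contig at which the running total of lengths,
    taken from longest to shortest, first reaches half the assembly size."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs supplied")


def gc_percent(sequence: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


# Phred 30 corresponds to one expected error in 1,000 base calls.
print(phred_error_probability(30))

# Toy contig lengths (hypothetical, not the FB1 contigs).
print(n50([10, 8, 6, 4, 2]))
```

For the toy lengths, the total is 30, and the two longest contigs (10 + 8) already cover half of it, so the N50 is 8.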

Genome comparison and phylogenetic analysis 

A whole-genome-based phylogenetic tree was constructed
by means of Composition Vector Tree (CVTree) version
2 [12], while a sequence-based genome comparison was
performed with RAST. The organisms to be included
were chosen according to the list of 'closest neighbours'
presented by RAST. The whole genome sequence
(WGS) data were obtained from the NCBI database.
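
The idea behind a composition-vector comparison can be sketched as follows: each genome is summarised by the frequencies of its K-tuples, and two genomes are compared through the cosine of the angle between their vectors. This is a simplified illustration under stated assumptions, not the CVTree program itself: CVTree additionally subtracts a background expected from shorter K-tuples before computing the correlation, and that step is omitted here. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(sequence: str, k: int = 6) -> dict:
    """Relative frequencies of K-tuples composed only of A, C, G and T."""
    seq = sequence.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Discard K-tuples containing N or other ambiguity codes.
    counts = {kmer: n for kmer, n in counts.items() if set(kmer) <= set("ACGT")}
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def composition_distance(freq_a: dict, freq_b: dict) -> float:
    """Distance (1 - C) / 2, where C is the cosine similarity of the two vectors."""
    dot = sum(value * freq_b.get(kmer, 0.0) for kmer, value in freq_a.items())
    norm_a = sqrt(sum(value * value for value in freq_a.values()))
    norm_b = sqrt(sum(value * value for value in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("sequence too short for the chosen k")
    return (1.0 - dot / (norm_a * norm_b)) / 2.0


# Toy sequences (hypothetical); real input would be whole-genome sequences.
genome_a = kmer_frequencies("ATGCGTACGTTAGCATGCGTACGTTAGC", k=6)
genome_b = kmer_frequencies("ATGCGTACGATAGCATGCGAACGTTAGC", k=6)
print(composition_distance(genome_a, genome_b))
```

A matrix of such pairwise distances is the kind of input from which a Neighbour-joining tree is built.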

Quality assurance 

The 16S rDNA gene was extracted from the draft genome
using the RNAmmer 1.2 server [13]. A single copy was
detected. A BLAST search against the NCBI microbial
16S database confirmed that it belongs to H. alvei.

Initial findings 

The basic statistics of this draft genome are summarised
in Table 1. Sixty percent of the 4,239 protein-coding sequences
were categorised into 548 subsystems. Figure 1
gives an overview, generated by RAST, of the subsystem
distribution in FB1 and the two other previously
sequenced H. alvei genomes.

The result of phylogenetic analysis, displayed as a
Neighbour-joining tree in Figure 2, showed the probable
evolutionary relatedness of the three H. alvei strains and
other selected organisms (genera Edwardsiella, Serratia,
and Yersinia were among the closest neighbours presented
by RAST). FB1 was found to be most closely related
to ATCC 51873; BIDMC 31, however, grouped
closer to Klebsiella pneumoniae, a result consistent
with that of 16S rDNA identification of the strain.

Table 1 List of genome statistics

Attribute                    Value
Genome size                  4,650,601 bp
No. of contigs               39
Minimum length of contigs    1,067 bp
Average coverage             119.1x
N50                          256,161 bp
G + C content                48.90%
No. of subsystems            548
No. of RNAs                  86
No. of CDS                   4,239

Figure 1 Subsystem category distribution statistics for H. alvei. The pie charts of three H. alvei strains are presented side by side to give an
overview of the subsystem coverage and the counts of each subsystem feature.

Subsystem feature counts                               FB1   ATCC 51873   BIDMC 31
Cofactors, Vitamins, Prosthetic Groups, Pigments       319   323          359
Cell Wall and Capsule                                  144   146          214
Virulence, Disease and Defense                          99    97          144
Potassium metabolism                                    32    36           34
Photosynthesis                                           0     0            0
Miscellaneous                                           51    56           72
Phages, Prophages, Transposable elements, Plasmids      90    73           36
Membrane Transport                                     179   188          202
Iron acquisition and metabolism                         60    54           74
RNA Metabolism                                         215   216          248
Nucleosides and Nucleotides                            128   128          142
Protein Metabolism                                     275   258          293
Cell Division and Cell Cycle                            37    39           43
Motility and Chemotaxis                                140   140            9
Regulation and Cell signaling                          124   113          169
Secondary Metabolism                                     6     5           17
DNA Metabolism                                         128   135          126
Regulons                                                 9     9            7
Fatty Acids, Lipids, and Isoprenoids                   122   124          141
Nitrogen Metabolism                                     46    48           46
Dormancy and Sporulation                                 3     3            5
Respiration                                            160   167          180
Stress Response                                        152   153          182
Metabolism of Aromatic Compounds                        32    32           79
Amino Acids and Derivatives                            441   422          553
Sulfur Metabolism                                       46    36           83
Phosphorus Metabolism                                   59    60           66
Carbohydrates                                          563   635          874

Figure 2 Phylogenetic analysis of the H. alvei FB1 genome. The Neighbour-joining tree was constructed based on the whole genome DNA
sequences with K-tuple length of 6. Taxa shown in the tree: Hafnia alvei ATCC 51873, Hafnia alvei FB1, Yersinia ruckeri ATCC 29473,
Yersinia bercovieri ATCC 43879, Yersinia kristensenii ATCC 33638, Yersinia rohdei ATCC 43380, Edwardsiella ictaluri ATCC 33202,
Serratia marcescens BIDMC 44, Serratia odorifera DSM 4582, Yersinia frederiksenii ATCC 33641, Edwardsiella tarda ATCC 15947,
Klebsiella pneumoniae ATCC 25955, Klebsiella pneumoniae RYC492, and Hafnia alvei BIDMC 31.

A sequence-based comparison between the two H. alvei
genomes and FB1 was performed, and the result is presented
as a colour-labelled circular map in Figure 3.
ATCC 51873 displayed a higher level of sequence
similarity to FB1 (visualised in green). Gaps on the
map indicate genes in the FB1 genome that
did not have a match in the compared genomes. The
intra-species difference between FB1 and ATCC 51873
was marked mainly by the presence of various phage-related
genes. Two relatively large gaps observed in the BIDMC
31 circle reflect the lack of gene clusters involved in flagella
biosynthesis. The genomic data strongly suggest
that BIDMC 31 lacks motility, a trait that resembles
K. pneumoniae. Figure 1 shows that only nine
out of the 4,398 CDS covered in subsystems contribute to
motility and chemotaxis in BIDMC 31, in contrast to 140
in both FB1 and ATCC 51873.

The Widespread Colonisation Island (WCI) was found
to be present in both FB1 and gut-inhabiting ATCC 51873
but absent in BIDMC 31. The tad (tight adherence) genes
that make up the WCI
play an essential role in biofilm formation, colonisation and
pathogenesis in a number of genera of bacteria and archaea
through the formation of Flp (fimbrial low-molecular-weight
protein) pili for adherence [14]. The ability of bacteria
to form biofilm is known to provide them with
resistance against physical, chemical, and gastric
stresses, which their planktonic counterparts lack [15]. In
this draft genome, the 12 tad genes necessary for forming all
the adherence-related phenotypes were found in Contig
18 (Figure 4). A similar gene orientation was also observed
in ATCC 51873.

Our data also show that Contig 18 of the H. alvei FB1
draft genome contains an ethanolamine utilisation
(eut) operon that possibly contributes to the thriving of
H. alvei in the gastrointestinal environment. The eut operon
gives the bacteria the ability to utilise ethanolamine,
a molecule present in abundance in the
host intestine, as a sole source of energy. The one found
in H. alvei FB1 is the typical long operon of the
Enterobacteriaceae family (Figure 4). The presence of a eut
operon is likely to favour the survival and colonisation
of H. alvei FB1 in the intestinal environment. It has also
been hypothesised that the eut operon plays a role in
pathogenesis, as the breakdown of phosphoethanolamine
in the epithelial cell membranes could disrupt normal gut
functions [16]. Interestingly, the sequence-based comparison
showed that this operon was also found in BIDMC 31, but not
in ATCC 51873. Instead, the latter possessed a paralogous
propanediol utilisation (pdu) operon, which is also
present in BIDMC 31 on a separate contig. The functions
of both operons involve the formation of proteinaceous
polyhedral microcompartments in which the entire metabolic
processes, i.e., ethanolamine and propanediol metabolism,
take place [17], and sequence homologies were seen in
the genes that contribute to the formation of
polyhedral-body-like proteins. Both operons have been reported to
be associated with survival and the expression of global
virulence regulators, based on the well-studied example of
Salmonella Typhimurium [15].

Tan et al. Gut Pathogens 2014, 6:29
http://www.gutpathogens.com/content/6/1/29

Figure 3 Genome sequence comparison of H. alvei ATCC 51873 (outer) and H. alvei BIDMC 31 (inner). Level of similarity is indicated by
the intensity of colour (scale: percent protein sequence identity, from 100 down to 10). Regions containing sequences that do not have a
match in the compared genomes are presented as 'gaps'. Regions labelled on the map: phage genes, tad and eut operons, flagella-related
genes, yeeU and yeeV putative toxin-antitoxin genes, and rRNA genes.

Previous reports have suggested a role for horizontal
gene transfer in the current distribution pattern of tad
and eut operons across species. Phylogenetic analysis by
Planet et al. revealed that horizontal gene transfer had
been a common event along the evolutionary history of
the tad gene cluster [16], whereas Tsoy et al. have discussed
the possibility of genes in eut and pdu operons being acquired
separately from different origins [18]. The possibility
of horizontal gene transfer and the diversity between
closely related organisms suggest that these operons could
be providing certain forms of selective advantage to the
species. The close proximity of the two operons raises
the possibility of a collective role of the
operons in pathogenesis. More studies are needed
to explore the complicated evolutionary events that
occurred.



In addition, our genome analysis showed the
presence of genes important for iron uptake, another indication
of virulence of H. alvei [19]. H. alvei has been reported
by Podschun et al. to produce a siderophore that is 'neither
aerobactin nor enterobactin' [20]. In Contig 6 of
this draft genome we found a cluster of four genes possibly
involved in a siderophore biosynthesis pathway
similar to that of aerobactin (Figure 5). A complete
fhu operon (fhuABCD) is also present, indicating that H.
alvei FB1 is able to utilise the self-produced ferrichrome
molecules as well as scavenge those available from the
surroundings. The ability to produce and take up siderophores
has been linked to virulence regulation and survival
in an iron-deficient mammalian host environment [21].

Future directions 

The presence of the said gene clusters putatively secures
H. alvei FB1's way through harsh environments towards
the host gut and ensures its advantage in inter-species
competition in the gut environment. Differences
were observed between the different strains of H. alvei.
However, there were limited data available to perform a
comparison able to show distinct evolutionary
tracks adopted by members of the species inhabiting
different environments. With the genomic data available,
our future study will focus on validating the
role of the tad operon in biofilm formation and the presence
of a regulatory role of the eut operon in the adherence trait of H.
alvei via mutant, cell culture and transcriptomic approaches,
in order to gain a better understanding of the behaviour
of FB1 in response to stresses and changes in
environment.

Figure 4 Orientation of tad, eut, and pdu operons in H. alvei FB1, ATCC 51873, and BIDMC 31. tad operon: 1. Type IV prepilin peptidase
TadV/CpaA; 2. Flp pilus assembly protein RcpC/CpaB; 3. Type II/IV secretion system secretin RcpA/CpaC; 4. Type II/IV secretion system ATPase
TadZ/CpaE; 5. Type II/IV secretion system hydrolase TadA/VirB/CpaF; 6. Flp pilus assembly protein TadB; 7. Type II/IV secretion system protein
TadC; 8. Flp pilus assembly protein TadD; 9. Flp pilus assembly membrane protein TadE; 10. Flp pilus assembly surface protein TadF; 11. Flp pilus
assembly protein TadG. eut operon: 1. Ethanolamine utilisation polyhedral-body-like protein EutS; 2. Ethanolamine utilisation protein EutP;
3. Ethanolamine utilisation protein EutQ; 4. ATP:Cob(I)alamin adenosyltransferase; 5. Phosphate acetyltransferase; 6. Ethanolamine utilisation
polyhedral-body-like protein EutM; 7. Ethanolamine utilisation polyhedral-body-like protein EutN; 8. Acetaldehyde dehydrogenase; 9. Ethanolamine
utilisation protein EutJ; 10. Ethanolamine utilisation protein EutG; 11. Ethanolamine permease; 12. Ethanolamine utilisation protein EutA; 13. Ethanolamine
ammonia-lyase heavy chain EutB; 14. Ethanolamine ammonia-lyase light chain EutC; 15. Ethanolamine utilisation polyhedral-body-like protein EutL;
16. Ethanolamine utilisation polyhedral-body-like protein EutK; 17. Cob(III)alamin reductase; 18. PduT; 19. Ethanolamine operon regulatory
protein EutR. pdu operon: 1. Propanediol utilisation transcriptional activator; 2. Propanediol utilisation polyhedral-body-like protein PduA;
3. Propanediol utilisation polyhedral-body-like protein PduB; 4. Propanediol dehydratase large subunit; 5. Propanediol dehydratase medium
subunit; 6. Propanediol dehydratase small subunit; 7. Propanediol dehydratase reactivation factor large subunit; 8. Propanediol dehydratase
reactivation factor small subunit; 9. Propanediol utilisation polyhedral-body-like protein PduJ; 10. Propanediol utilisation polyhedral-body-like
protein PduK; 11. Propanediol utilisation protein PduL; 12. Propanediol utilisation protein PduM; 13. Propanediol utilisation polyhedral-body-like protein
PduN; 14. ATP:Cob(I)alamin adenosyltransferase; 15. CoA-acylating propionaldehyde dehydrogenase; 16. Putative iron-containing NADPH-dependent
propanol dehydrogenase; 17. Cob(III)alamin reductase; 18. Propanediol utilisation polyhedral-body-like protein PduT; 19. Propanediol utilisation
polyhedral-body-like protein PduU; 20. Propanediol utilisation protein PduV; 21. Propionate kinase; 22. Propanediol diffusion facilitator.

Availability of supporting data 

This whole genome shotgun project has been deposited
at DDBJ/EMBL/GenBank under the accession
JCKH01000000. The version described in this paper
is version JCKH01000000.



Figure 5 Orientation of the siderophore gene cluster in H. alvei FB1 and that in closely related organisms. 1 (short): Desferrioxamine E
biosynthesis protein, DesC, siderophore synthetase small component, acetyltransferase (AlcB homologue); 1 (long): Desferrioxamine E biosynthesis
protein, DesD, siderophore synthetase component, ligase (IucA, IucC homologue); 2: Siderophore [alcaligin-like] biosynthetic enzyme, monooxygenase
(IucD homologue); 3: Siderophore [alcaligin-like] decarboxylase; 4: Ferrichrome-iron receptor; 5: Ferric reductase; 6: Iron-siderophore transport system,
ATP-binding component; 7: Multidrug translocase MdfA; 8: YgiD; 9: ZupT; 10: 3,4-dihydroxy-2-butanone 4-phosphate synthase; 11: GGDEF domain
protein; 12: Hypothetical protein; 13: Putative permease. Organisms labelled in the figure: H. alvei ATCC 51873, Yersinia pestis KIM,
Chromohalobacter salexigens DSM 3043, and Thermobispora bispora DSM 43833.